Assessing treatment efficacy for interval-censored endpoints using multistate semi-Markov models fit to multiple data streams
Jon Fintzi
Statistical Methodology and Innovation, Bristol Myers Squibb
August 4, 2025
REGEN-2069 trial of mAb as COVID-19 prophylaxis
Monoclonal antibody (mAb), REGEN-COV, for prevention of COVID-19.
Primary endpoint was symptomatic infection within 28 days.
SARS-CoV-2 naïve unvaccinated participants enrolled within 96 hours of household index case.
Randomized 1:1 to mAb vs. placebo.
Monitored continuously for symptoms, with weekly nasopharyngeal swabs for RT-qPCR and serology for anti-nucleocapsid antibodies at 28 days.
- PCR indicates ongoing viral shedding.
- Positive serology is a marker of immune response to infection.
81.4% reduction in risk of symptomatic infection (odds ratio, 0.17; 95% CI, 0.09 to 0.33; p < 0.001).
Goals in secondary analysis:
- Protective efficacy (PE) against infection (symptomatic + asymptomatic).
- Cumulative incidence of infection over the 28-day study period.
- Seroconversion following infection.
- Duration of detectable viral shedding.
Difficulty: infection is not continuously observed.
Note: From now on, infection = participant is measurably affected by infection.
Data assimilation
Strategy: combine PCR, symptom, and serology data.
- No one data stream completely captures infection.
- More kicks at the can to detect infections.
- Scientific collaborators use their clinical and immunological expertise to help us formulate a model.
Modeling multiple data streams (big picture)
Study participants transition through discrete states of infection and immune response:
- 1 = Infection naïve,
- 2 = PCR+, no history of symptoms,
- 3 = No longer PCR+, no history of symptoms,
- 4 = PCR+ with history of symptoms,
- 5 = No longer PCR+ with history of symptoms.
Idea: PCR, symptoms, and serology are breadcrumbs about each participant’s trajectory.
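To make the "breadcrumbs" idea concrete, here is a minimal sketch of how a single observation narrows down the set of possible states. It assumes, purely for illustration, that PCR results and symptom histories are recorded without error and it ignores serology; the names `State` and `compatible_states` are hypothetical and not taken from the talk or the manuscript.

```python
from enum import IntEnum

class State(IntEnum):
    """The five states listed on the slide (labels are illustrative)."""
    NAIVE = 1
    PCR_POS_NO_SYMPTOMS = 2
    PCR_NEG_NO_SYMPTOMS = 3   # no longer PCR+, no history of symptoms
    PCR_POS_SYMPTOMS = 4
    PCR_NEG_SYMPTOMS = 5      # no longer PCR+, history of symptoms

# What each state implies for (currently PCR+, any history of symptoms).
# Assumption: error-free PCR and symptom reporting.
SIGNATURE = {
    State.NAIVE: (False, False),
    State.PCR_POS_NO_SYMPTOMS: (True, False),
    State.PCR_NEG_NO_SYMPTOMS: (False, False),
    State.PCR_POS_SYMPTOMS: (True, True),
    State.PCR_NEG_SYMPTOMS: (False, True),
}

def compatible_states(pcr_positive, symptom_history):
    """Return the set of states consistent with one swab plus symptom history."""
    return {s for s, sig in SIGNATURE.items() if sig == (pcr_positive, symptom_history)}
```

Under these assumptions a negative swab in a participant with no symptom history is compatible with both state 1 and state 3, which is one way the state label can remain only partially identified at an observation time.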
Here’s the problem
Data are a coarse reflection of a latent biological process in continuous time.
- Coarse data with different temporal resolutions and complicated censoring.
- Incomplete identification of state labels at observation times.
- Likelihood is a product of transition probabilities over inter-observation intervals.
Marginal likelihoods for semi-Markov processes integrate over the number + timing of unobserved state transitions.
- Tractable for simple progressive processes, very difficult to evaluate in general.
- Unlike the Markov case, there are no Kolmogorov forward equations to solve for the transition probabilities.
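For contrast, the time-homogeneous Markov case is easy: with generator Q, the transition probabilities over an interval of length t are exp(Qt), so the likelihood of states observed at discrete times is a product of matrix-exponential entries. The sketch below illustrates that shortcut under the simplifying assumption that the state is observed exactly at each visit; the function names are hypothetical and the matrix exponential is a basic scaling-and-squaring Taylor approximation written only to keep the example self-contained.

```python
import math

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(Q, t, terms=20):
    """Approximate exp(Q * t) by scaling and squaring with a truncated Taylor series."""
    n = len(Q)
    norm = max(sum(abs(x) for x in row) for row in Q) * t
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scale = t / (2 ** squarings)
    A = [[q * scale for q in row] for row in Q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def markov_panel_loglik(Q, times, states):
    """Log-likelihood of exactly observed states at discrete times (0-based state indices)."""
    loglik = 0.0
    for k in range(1, len(times)):
        P = mat_exp(Q, times[k] - times[k - 1])
        loglik += math.log(P[states[k - 1]][states[k]])
    return loglik
```

A semi-Markov model has no analogous closed form because the transition intensities depend on time since state entry, which is why the number and timing of unobserved transitions have to be integrated out some other way.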
Methodological contribution
Recall, in the expectation-maximization (EM) algorithm we alternate between:
E-step: calculate the expected complete-data log-likelihood (the Q-function), averaging over the missing data.
M-step: maximize the Q-function.
Wash, rinse, repeat until convergence to obtain an MLE.
Key innovation: Monte Carlo expectation-maximization framework for fitting multistate semi-Markov models to coarsened data.
E-step: approximate Q-function via Monte Carlo – average log-likelihood over paths that are sampled conditionally on the data.
M-step: maximize the Monte Carlo approximation of the Q-function.
Wash, rinse, repeat until convergence to obtain an MLE.
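The loop above can be illustrated on a deliberately tiny problem. The sketch below is not the multistate algorithm from the talk: it assumes a single exponential event time per participant that is only known to fall in an interval (left, right], and all function names and default settings are hypothetical. The E-step draws event times conditionally on the intervals and the M-step plugs their Monte Carlo average into the complete-data MLE.

```python
import math
import random

def sample_truncated_exponential(rate, left, right, rng):
    """Draw T ~ Exp(rate) conditional on left < T <= right using the inverse CDF."""
    u = rng.random()
    surv_left = math.exp(-rate * left)
    surv_right = math.exp(-rate * right)
    return -math.log(surv_left - u * (surv_left - surv_right)) / rate

def mcem_exponential_rate(intervals, rate_init=1.0, n_iter=50, n_paths=200, seed=1):
    """Toy Monte Carlo EM for the rate of interval-censored exponential event times."""
    rng = random.Random(seed)
    rate = rate_init
    for _ in range(n_iter):
        # E-step: Monte Carlo estimate of sum_i E[T_i | interval_i, current rate].
        expected_total = 0.0
        for left, right in intervals:
            draws = [sample_truncated_exponential(rate, left, right, rng)
                     for _ in range(n_paths)]
            expected_total += sum(draws) / n_paths
        # M-step: Q(rate) = n*log(rate) - rate*expected_total is maximized here.
        rate = len(intervals) / expected_total
    return rate
```

Because the E-step is stochastic, the iterates fluctuate around the MLE rather than converging exactly; increasing `n_paths` across iterations is one common way to tame that noise.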
But how should we sample the paths?
Our proposal:
- Sample latent paths using a Markov surrogate + standard algorithms for HMMs and endpoint conditioned Markov chains.
- Model-agnostic algorithm: can accommodate complex model structure, semi-parametric intensities, and coarse data.
- Details in the manuscript, available on arXiv: https://arxiv.org/abs/2501.14097.
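As a flavor of what sampling an endpoint-conditioned Markov chain involves, here is a minimal sketch using naive rejection: simulate the chain forward from the known starting state and keep the path only if it ends in the known final state. This is the simplest textbook approach and is an assumption for illustration; the talk does not say which endpoint-conditioned sampler is used, and how surrogate draws are corrected toward the semi-Markov target is left to the manuscript. Function names are hypothetical.

```python
import random

def simulate_path(Q, start, duration, rng):
    """Forward-simulate a continuous-time Markov chain with generator Q on [0, duration]."""
    t, state = 0.0, start
    path = [(0.0, start)]  # list of (jump time, state entered)
    while True:
        exit_rate = -Q[state][state]
        if exit_rate <= 0.0:
            return path  # absorbing state: no further jumps
        t += rng.expovariate(exit_rate)
        if t >= duration:
            return path
        # Next state is chosen with probability proportional to the off-diagonal rates.
        targets = [j for j in range(len(Q)) if j != state and Q[state][j] > 0.0]
        weights = [Q[state][j] for j in targets]
        state = rng.choices(targets, weights=weights)[0]
        path.append((t, state))

def sample_endpoint_conditioned(Q, start, end, duration, rng, max_tries=100000):
    """Rejection sampler: keep the first forward path that ends in state `end`."""
    for _ in range(max_tries):
        path = simulate_path(Q, start, duration, rng)
        if path[-1][1] == end:
            return path
    raise RuntimeError("no accepted path; the endpoint may be unreachable or very unlikely")
```

Rejection becomes wasteful when the required endpoint is improbable, which is one reason more efficient conditioned samplers are usually preferred in practice.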
Models
Fit 6 models varying in their flexibility.
Results
We choose the second most flexible model by AIC.
- Gain 100 units of log-likelihood over time-homogeneous Markov model.
- Gain 78.7 units of log-likelihood over parametric Weibull intensity model.
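For reference, the standard AIC comparison behind this kind of statement is sketched below. The symbols $k_A$, $k_B$ (parameter counts) and $\hat\ell_A$, $\hat\ell_B$ (maximized log-likelihoods) are notation introduced here; the parameter counts of the fitted models are not reported on this slide.

```latex
\mathrm{AIC} = 2k - 2\hat\ell,
\qquad
\mathrm{AIC}_B - \mathrm{AIC}_A
  = 2\,(\hat\ell_A - \hat\ell_B) - 2\,(k_A - k_B).
```

Model A is preferred whenever its log-likelihood gain exceeds its number of additional parameters, so a gain of 100 units favors the more flexible model unless it adds 100 or more parameters.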
Results
Incidence of symptomatic infection (left) and all-comer infection (right)
- Good fit for incidence of symptomatic infection (continuously observed).
- Large gap in week 1 between observed infections and smooth estimate reflects early infections not caught until first PCR or symptom onset.
- Persistent discrepancy reflects short shedders who are missed by weekly PCR.
Key takeaways from the analysis:
Multifaceted benefit of mAb prophylaxis:
Lower rate of seroconversion following infection.
- RR of seroconversion = 31.9% (95% CI: 22.3%, 44.6%).
- Consistent with less intense immune response to lower viral loads.
Shortened duration of detectable viral shedding:
- mAb: 6.2 days (95% CI: 5.0 days, 7.8 days),
- Placebo: 13.0 days (95% CI: 11.5 days, 14.6 days).
Wrapping up
Contributions on methodology and implementation
- Fit semi-Markov models to data with complex coarsening patterns.
- Algorithm is agnostic to model structure, but not a panacea for non-identifiability; in that case it can fail loudly (a strength).
- Can accommodate splines + other flexible functions.
- Flexible and general implementation in R/Julia.
Methodological extensions (“low-hanging fruit”)
- Disease driven observation schemes + preferential sampling.
- Penalized splines with approximate cross-validation for semi-parametric inference with automatic smoothing.
- Fast robust uncertainty quantification.
- Phase-type proposal distributions.
Please reach out if you want to collaborate on an analysis or methods! 😊 (jonathan.fintzi@bms.com)
Thank you!
And also thanks to my excellent collaborators!
- Raphaël Morsomme (FDA),
- C. Jason Liang, Allyson Mateja, Dean Follmann (NIAID, NIH),
- CG Wang, Meagan O’Brien (Regeneron).
Crude tabulation of participant outcomes
Comparison with phase-type models
Setup:
- Recurrent illness-death model, monthly observations over 1 year, death observed exactly. N = 1000.
- Healthy -> Ill is Weibull with increasing intensity, other transitions exponential.
- Models: time-homogeneous Markov, phase-type with 2 latent states for healthy -> ill transition, semi-Markov with spline for healthy -> ill transition (degree 1, interior knot at 0.5 chosen arbitrarily).
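To illustrate what "Weibull with increasing intensity" means in the setup above, here is a small sketch of the Weibull hazard and a sojourn-time sampler. The parameterization (cumulative hazard (t/scale)^shape) and the function names are assumptions made for illustration; the talk does not give the parameter values used in the simulation.

```python
import math
import random

def weibull_hazard(t, shape, scale):
    """Hazard h(t) = (shape/scale) * (t/scale)^(shape-1); increasing in t when shape > 1."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def sample_weibull_sojourn(shape, scale, rng):
    """Draw a sojourn time by inverting the survival function exp(-(t/scale)^shape)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return scale * (-math.log(u)) ** (1.0 / shape)
```

With shape equal to 1 the hazard is constant and the model reduces to the exponential (Markov) case, which is why a time-homogeneous Markov fit is misspecified for the healthy -> ill transition in this setup.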
Efficiency vs. rejection sampling
Compute effort to obtain complete trajectories for a single transition. A&B = rejection sampler from Aralis & Brookmeyer (2019).